Abstract

Lecture notes of a block course explaining why quantum field theory might be in a
better mathematical state than the typical introduction to the topic suggests.
It is explained how to make sense of a perturbative expansion that fails
to converge and how to express Feynman loop integrals and their renormalization
using the language of distributions rather than divergent, ill-defined integrals.

1 Introduction 

Physicists are often lax when it comes to mathematical rigor and use objects that
do not exist according to strict mathematical standards or happily exchange limits
without justification. This different culture of "everything is allowed as long as it is
not proven to be wrong, and even then it is sometimes ok because we do not actually
mean what we are writing" is preferred by many as it allows one to "focus on the content
rather than the formal aspects" and to progress at a much faster pace.

This attitude can be seen when physicists talk about quantum mechanics and
treat operators as if they were matrices and plane waves as if they were elements
of the relevant Hilbert space. This is generally accepted since one has the feeling
that these arguments can easily be repaired at the expense of clarity by talking
about wave packets instead of plane waves and (as is discussed at length in
our "Mathematical Quantum Mechanics" course) by talking about quadratic forms
instead of the operators directly.

The situation appears to be very different in the case of quantum field theory:
There, most of the time, one deals with perturbative series expansions in the coupling
constant without thinking about convergence (or, if one spends some thought
on this, one easily sees that the radius of convergence has to be zero). The individual
terms in the series turn out to be divergent, and one obtains reasonable, finite
expressions only after some very doubtful formal manipulations (often presented as
subtracting infinity from infinity in the "right way"). The typical QFT course, unlike
quantum mechanics above, does not indicate any way to "repair" these mathematical
shortcomings. Often, one is left with the impression that some blind faith is
required on the side of the physicists, or at least that some black magic is helping
to obtain numerical values that fit so impressively what is measured in experiments
from very doubtful expressions.

In these notes we will indicate some ways in which these treatments can be 
made more exact mathematically thus providing some cure to the mathematical 
uneasiness related to quantum field theory. In particular, we will argue that QFT is 
not "obviously wrong" as claimed by some mistakenly confusing mathematical rigor 
with correctness. 

Concretely, we want to explain how two (mostly independent) crucial steps in 
QFT can be understood more mathematically: 

In a simplified example, we will explore what conclusions can be drawn from the 
perturbative expansion even though the series does not converge for any finite value 
of the coupling constant. In particular we will discuss the role of non-perturbative 
contributions like instantons in the full interacting theory. We will find that up to a 
certain level of accuracy (depending on the strength of coupling), the first terms of 
the perturbative expansion do represent the full answer even though summing up all 
terms leads to infinite, meaningless expressions. Furthermore, at least in principle, 
using the technique of "Borel resummation" one can express the true expression for 
all values of the coupling constant in terms of just the perturbative expansion. 

As a second step, at each order in perturbation theory, we will see how by correctly
using the language of distributions one can set up the calculation of Feynman
diagrams without diverging momentum integrals. We will find that these divergences
can be understood to arise from trying to multiply distributions. We will set
this up as the problem of extending distributions from a subset of all test functions at
the expense of a finite number of undetermined quantities that we will identify as
the "renormalized coupling constants". Finally we will understand how these vary
when we change regulating functions that were introduced in the procedure, which
leads to an understanding of the renormalization group in this formalism of "causal
perturbation theory".

The aim is to argue how the techniques of physicists could be embedded in a
more mathematical language without actually doing this. In many places we just
claim results without proof or argue by analogy (for example we will discuss a
one-dimensional integral instead of an infinite-dimensional path integral). To really
discuss the topic at a mathematical level of rigor requires a lot more work and, to
a large extent, still needs to be done for theories of relevance to particle physics.

All this material is not new but well known to experts in the field. Still, we hope 
that these notes will be a useful complement to standard introductions to quantum 
field theory for (beginning) practitioners.

2 Perturbative expansion: making sense of divergent series

Before we take a look at divergent series, we will first give a brief review of how
the perturbative expansion is used in quantum field theory.



2.1 Brief overview on path integrals 

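A quantum field theory in Minkowski spacetime is described by a Lagrangian density
$\mathcal{L}(\phi, \partial\phi)$ and a generating functional of correlation functions

$$Z[J] = \int \mathcal{D}\phi\; e^{i \int d^4x\, (\mathcal{L} + J\phi)} . \qquad (1)$$

The correlation functions can be obtained by functional derivatives of $Z[J]$ in (1) with respect to $J$:

$$\langle \phi(x_1)\, \phi(x_2) \cdots \rangle = \frac{1}{Z[0]} \left(-i \frac{\delta}{\delta J(x_1)}\right) \left(-i \frac{\delta}{\delta J(x_2)}\right) \cdots Z[J]\, \bigg|_{J=0} \qquad (2)$$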

In this lecture we will use Euclidean signature for the metric instead of Minkowski.
The change between the metrics can be performed as a rotation of the time axis in
the complex plane, $t \to -i\tau$, if all expressions are analytic. In Euclidean metric,
the exponent in the generating functional is real and falls off at large field values.
This gives the path integral a chance to have a mathematical definition in terms of
Wiener measures, but that will not concern us in these notes.

$$Z[J] = \int \mathcal{D}\phi\; e^{-\int d^4x\, (\mathcal{L} + J\phi)} \qquad (3)$$



In general, the integral (3) cannot be computed exactly. For a scalar quantum field
theory in Euclidean space the Lagrangian has the form

$$\mathcal{L} = \frac{1}{2} \phi\, (\Box - m^2)\, \phi - V(\phi) \qquad (4)$$

with $\Box = (\partial_\tau)^2 + (\nabla)^2$.

If the potential $V(\phi)$ vanishes, equation (3) can be formally computed as it
becomes an integral of Gaussian type. One therefore arbitrarily splits the Lagrangian
into its "kinetic part" $\frac{1}{2} \phi (\Box - m^2) \phi$ and its "interaction part" $-V(\phi)$:

$$Z[J] = \int \mathcal{D}\phi\; e^{-\int d^4x\, \frac{1}{2} \phi (\Box - m^2) \phi \,-\, \int d^4x\, V(\phi)}\; e^{-\int d^4x\, J\phi} \qquad (5)$$

To obtain the Gaussian integral one has to complete the square in the exponent.
This is achieved by shifting the field $\phi$:

$$\phi' = \phi + (\Box - m^2)^{-1} J \qquad (6)$$



1 This subsection displays some standard expressions to set the context. For many more details see
for example [1].

The inverse of $\Box - m^2$, called the "Green's function" $G(x - y)$, is a distribution defined
by

$$(\Box - m^2)\, G(x - y) = \delta(x - y). \qquad (7)$$

Changing variables in the functional integral (5) leads to

$$Z[J] = e^{-\int d^4x\, V\left(\frac{\delta}{\delta J}\right)} \int \mathcal{D}\phi'\; e^{-\int d^4x\, \frac{1}{2} \phi' (\Box - m^2) \phi'}\; e^{-\int d^4x \int d^4y\, \frac{1}{2} J(x)\, G(x - y)\, J(y)} . \qquad (8)$$

The complicated expression in the middle of equation (8) does not depend on $J$ and
will in fact cancel out in equation (2) for the correlation function, so we will just
denote it $C$ and forget about it:

$$Z[J] = C\, e^{-\int d^4x\, V\left(\frac{\delta}{\delta J}\right)}\; e^{-\int d^4x \int d^4y\, \frac{1}{2} J(x)\, G(x - y)\, J(y)}$$

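Now let us take a look at a specific example of a quantum field theory by
choosing a potential for the scalar field. We will consider our favorite $\phi^4$ theory
given by the potential

$$V(\phi) = \lambda \phi^4 . \qquad (9)$$

The next step is to insert this potential in equation (8) and write the exponential
as a power series in the coupling strength $\lambda$:

$$e^{-\lambda \int d^4x\, \frac{\delta^4}{\delta J(x)^4}} = \sum_{k=0}^{\infty} \frac{(-\lambda)^k}{k!} \left( \int d^4x\, \frac{\delta^4}{\delta J(x)^4} \right)^k \qquad (10)$$

We have now found an expression for any general correlation function in terms of a
power series expansion in the coupling strength:

$$\langle \phi(y_1) \cdots \phi(y_n) \rangle = \frac{\delta}{\delta J(y_1)} \cdots \frac{\delta}{\delta J(y_n)} \sum_{k=0}^{\infty} \frac{(-\lambda)^k}{k!} \int d^4x_1\, \frac{\delta^4}{\delta J(x_1)^4} \cdots \int d^4x_k\, \frac{\delta^4}{\delta J(x_k)^4}\; e^{-\int d^4x \int d^4y\, \frac{1}{2} J(x)\, G(x - y)\, J(y)}\, \bigg|_{J=0} \qquad (11)$$

The combinatorics of the occurring expressions in terms of integrals over interaction
points $x_i$, Green's functions and external fields can be summarized in terms of
Feynman diagrams, each standing for a single term in the power series in the coupling
constant.² In the following, we want to study the convergence behavior of this
power series.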



2.2 Radius of convergence of correlation functions 

Let us briefly review the definition of the radius of convergence for a power series
from introductory analysis. It is useful to think of a power series as being defined in the
complex plane:

$$\sum_{k}^{\infty} \lambda^k\, (\ldots), \qquad \lambda \in \mathbb{C} \qquad (12)$$

2 The careful reader wishing to avoid ill-defined expressions using path integrals can use formula (11)
as the definition of the terms in the perturbative series.

(If one does not like the idea of a complex coupling strength in a quantum field
theory, just restrict to the special $\lambda \in \mathbb{C}$ that happen to be real.) Every power
series has a radius of convergence $R \in [0, \infty]$ such that

$$\sum_{k}^{\infty} \lambda^k\, (\ldots) \quad \begin{cases} \text{converges} & \forall\, |\lambda| < R \\ \text{diverges} & \forall\, |\lambda| > R. \end{cases} \qquad (13)$$



Now we want to find out the radius of convergence for the correlation functions
(11) in a quantum field theory. A physicist's argument was given by Freeman Dyson
in 1952 [2]. Let us take a look at the potential, for example in our $\phi^4$ theory as
shown in figure 1. For positive coupling strength $\lambda$ the potential is bounded from
below and large values of $\phi$ are strongly disfavored. This behavior, however, becomes
radically different in the case of a negative $\lambda$. The potential becomes unbounded from
below and the field $\phi$ will want to run off to $\phi \to \pm\infty$. Obviously, such a behavior is
highly unphysical, since ever increasing values of $\phi$ would lead to an infinite energy
gain. It is thus clear that such a theory cannot lead to healthy correlation functions;
in other words, for any negative $\lambda$ the power series (12) will diverge.³ From this we
can conclude that the radius of convergence is $R = 0$:

$$\sum_{k}^{\infty} \lambda^k\, (\ldots) \quad \text{diverges} \quad \forall\, \lambda > 0 \qquad (14)$$




3 We expect at least a phase transition when $\lambda$ is changed from positive to negative values.

For readers not satisfied by this argument, which uses the physics of unstable potentials
to determine the radius of convergence, let us mention an alternative line of
argument. Again, consider equation (11); this time, however, we will focus on the
Feynman diagrams. At any order $k$ in the perturbation expansion there is a sum
of different Feynman diagrams expressing the integrals in (11), where $k$ counts the
number of vertices. The combinatorics of all Feynman diagrams shows that the
number of Feynman diagrams grows like $k!$. The power series, therefore, will behave
like

$$\sum_{k}^{\infty} \lambda^k\, k!\, (\ldots). \qquad (15)$$

Assuming that $(\ldots)$ is not surprisingly suppressed for large $k$, the coefficients of $\lambda^k$
grow faster than any power, and we again find the radius of convergence $R = 0$.

In the following, we want to give an example of how one can nevertheless make
sense of (some) divergent power series.

2.3 Non-perturbative corrections 

In order to get a feeling for the problem of divergent power series, we will consider
a one-dimensional toy problem (rather than the infinite-dimensional problem of a
path integral):

$$Z(\lambda) = \int_{-\infty}^{\infty} dx\, e^{-x^2 - \lambda x^4} \qquad (16)$$

We take $\lambda > 0$, so this integral yields some finite, positive number. For $\lambda = 0$ the
solution is well known:

$$Z(0) = \sqrt{\pi}. \qquad (17)$$

In general, equation (16) can be expressed in terms of special functions, e.g. Mathematica
gives the solution

$$Z(\lambda) = \frac{e^{\frac{1}{8\lambda}}\, K_{1/4}\!\left(\frac{1}{8\lambda}\right)}{2\sqrt{\lambda}} \qquad (18)$$

with $K_n(x)$ being the modified Bessel function of the second kind. We call solution
(18) the "full, non-perturbative answer". Now we will do the same as in quantum
field theory and split the integral into a "kinetic" and an "interaction" part,
respectively.
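As a quick numerical cross-check of the toy integral (16), the following minimal sketch evaluates $Z(\lambda)$ by direct quadrature. It is only an illustration: the function name `toy_partition_function`, the integration window and the number of Simpson steps are assumptions made here, not part of the notes.

```python
import math

def toy_partition_function(lam, cutoff=10.0, steps=20000):
    """Approximate Z(lam) = integral of exp(-x^2 - lam*x^4) over the real line.

    Composite Simpson rule on [-cutoff, cutoff]. For lam >= 0 the integrand is
    below exp(-100) outside this window, so truncating the range is harmless.
    `cutoff` and `steps` are illustrative choices (assumptions of this sketch).
    """
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of sub-intervals
    h = 2.0 * cutoff / steps

    def integrand(x):
        return math.exp(-x * x - lam * x ** 4)

    total = integrand(-cutoff) + integrand(cutoff)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(-cutoff + i * h)
    return total * h / 3.0

if __name__ == "__main__":
    print(toy_partition_function(0.0), math.sqrt(math.pi))  # compare with (17)
    print(toy_partition_function(1.0 / 50))                 # a small positive coupling
```

A brute-force quadrature like this needs no series expansion at all, so it can serve as the reference against which truncated perturbative sums are judged.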

2.3.1 Treating the toy model perturbatively 

Following the same procedure, we will again expand the "interaction part" $-\lambda x^4$ in
a power series:

$$\int_{-\infty}^{\infty} dx\, e^{-x^2 - \lambda x^4} = \int_{-\infty}^{\infty} dx\, e^{-x^2} \sum_{k=0}^{\infty} \frac{(-\lambda)^k}{k!}\, x^{4k} \qquad (19)$$

Now comes the crucial step and "root of all evil". Following precisely the same steps
leading towards equation (11) for correlation functions in quantum field theory, we
will change the order of integration and summation, leading to the interpretation as
a power series of Feynman diagrams:

$$"Z(\lambda)" = \sum_{k=0}^{\infty} \frac{(-\lambda)^k}{k!} \int_{-\infty}^{\infty} dx\, x^{4k}\, e^{-x^2} \qquad (20)$$

From this step, as we will see later, the problems arise. Although this step is
forbidden (as, roughly speaking, we are changing the behavior of the integrand at
$x = \pm\infty$), we are interested in to what extent a "perturbative solution" obtained from
equation (20) will agree with the full, non-perturbative solution (18). Carrying on,
we observe that the integral in (20) is now of the type "polynomial times Gaussian"
and can be computed with standard methods. We smuggle an additional factor $a$
into the exponent, allowing us to write the integrand as derivatives of $e^{-a x^2}$ with
respect to $a$ at the point $a = 1$:



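$$"Z(\lambda)" = \sum_{k=0}^{\infty} \frac{(-\lambda)^k}{k!} \int_{-\infty}^{\infty} dx\, \frac{d^{2k}}{da^{2k}}\, e^{-a x^2}\, \bigg|_{a=1} = \sum_{k=0}^{\infty} \frac{(-\lambda)^k}{k!}\, \frac{d^{2k}}{da^{2k}} \sqrt{\frac{\pi}{a}}\; \bigg|_{a=1} \qquad (21)$$

Of course we can easily evaluate the derivatives:

$$\frac{d^{2k}}{da^{2k}}\, a^{-1/2}\, \bigg|_{a=1} = \underbrace{\frac{1}{2} \cdot \frac{3}{2} \cdot \frac{5}{2} \cdot \frac{7}{2} \cdot \frac{9}{2} \cdot \frac{11}{2} \cdots}_{\text{total of } 2k \text{ factors}} \qquad (22)$$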



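In order to find an explicit expression for (22) one can insert factors of 1 between
all factors, such that the numerator becomes $(4k)!$:

$$\begin{aligned} \frac{d^{2k}}{da^{2k}}\, a^{-1/2}\, \bigg|_{a=1} &= \underbrace{\frac{1}{2} \cdot \frac{2}{2} \cdot \frac{3}{2} \cdot \frac{4}{4} \cdot \frac{5}{2} \cdot \frac{6}{6} \cdot \frac{7}{2} \cdot \frac{8}{8} \cdot \frac{9}{2} \cdot \frac{10}{10} \cdot \frac{11}{2} \cdot \frac{12}{12} \cdots}_{\text{total of } 4k \text{ factors}} \\ &= (4k)!\; \frac{1}{2^{2k}}\; \underbrace{\frac{1}{2} \cdot \frac{1}{4} \cdot \frac{1}{6} \cdot \frac{1}{8} \cdot \frac{1}{10} \cdot \frac{1}{12} \cdots}_{\text{total of } 2k \text{ factors}} \\ &= \frac{(4k)!}{2^{2k}\, 2^{2k}\, (2k)!} = \frac{(4k)!}{2^{4k}\, (2k)!} \end{aligned} \qquad (23)$$

Thus we obtain the "perturbative solution" of problem (16):

$$"Z(\lambda)" = \sqrt{\pi} \sum_{k=0}^{\infty} \frac{(-\lambda)^k\, (4k)!}{2^{4k}\, (2k)!\, k!} \qquad (24)$$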
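The closed form behind (23) and (24) can be checked numerically. The sketch below compares the Gaussian moment $\int dx\, x^{4k} e^{-x^2}$, evaluated by quadrature, with $\sqrt{\pi}\,(4k)!/(2^{4k}(2k)!)$, and lists the first coefficients of (24). The function names and the quadrature parameters are assumptions chosen for illustration.

```python
import math

def gaussian_moment_formula(k):
    """sqrt(pi) * (4k)! / (2^(4k) * (2k)!), the closed form obtained via (23)."""
    return math.sqrt(math.pi) * math.factorial(4 * k) / (2 ** (4 * k) * math.factorial(2 * k))

def gaussian_moment_numeric(k, cutoff=12.0, steps=40000):
    """Integral of x^(4k) exp(-x^2) over the real line by the composite Simpson rule.

    `cutoff` and `steps` (even) are illustrative values, adequate for small k.
    """
    h = 2.0 * cutoff / steps

    def integrand(x):
        return x ** (4 * k) * math.exp(-x * x)

    total = integrand(-cutoff) + integrand(cutoff)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(-cutoff + i * h)
    return total * h / 3.0

def series_coefficient(k):
    """Coefficient of (-lambda)^k in (24), without the overall factor sqrt(pi)."""
    return math.factorial(4 * k) / (2 ** (4 * k) * math.factorial(2 * k) * math.factorial(k))

if __name__ == "__main__":
    for k in range(5):
        print(k, gaussian_moment_numeric(k), gaussian_moment_formula(k), series_coefficient(k))
```

The coefficients start as 1, 3/4, 105/32 and keep growing, which already hints at the divergence discussed next.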



Let us take a closer look at this expression. By observing that the denominator of
the summand eventually contains smaller factors than the numerator for all $k$ larger
than a critical integer, we can see that the series is divergent. More carefully, we
can apply Stirling's formula $n! \approx \sqrt{2\pi n}\, \left(\frac{n}{e}\right)^n$ for large values of $k$:

$$\frac{(4k)!}{2^{4k}\, (2k)!\, k!} \approx \frac{4^k}{\sqrt{\pi k}} \left(\frac{k}{e}\right)^k \approx \frac{1}{\sqrt{2}\, \pi k}\; 4^k\, k! \qquad (25)$$

We already know that the sum

$$\sum_{k=0}^{\infty} (-4\lambda)^k\, k! \qquad (26)$$

will diverge. This shows that the power series (24) is divergent and in particular it
is not the finite number that we are looking for as an expression for (16).



2.3.2 The perturbative and the full solution compared 

Even though the perturbative series will diverge, we want to study its numerical
usefulness at finite order. After all, one usually computes only a finite number
of Feynman diagrams to obtain only the first few summands of the perturbative
expansion. Is there a way to approximate the full, non-perturbative solution (18)
from (24)? Let us choose one value for $\lambda$, e.g. $\lambda = \frac{1}{50}$, and evaluate (18) numerically:

$$Z\!\left(\frac{1}{50}\right) = 1.7478812. \qquad (27)$$

For the same value of $\lambda$ the evaluation of the first few terms of the infinite sum (24),

$$Z_N = \sqrt{\pi} \sum_{k=0}^{N} \frac{(-\lambda)^k\, (4k)!}{2^{4k}\, (2k)!\, k!}, \qquad (28)$$

gives

$$Z_5 = 1.7478728, \qquad (29a)$$
$$Z_{10} = 1.7478818. \qquad (29b)$$



The first terms of the perturbative solution agree up to six digits! We can
conveniently use a figure to plot higher orders of the perturbative series. Figure 2
shows that the perturbative solution gets, in a certain regime, very close to the result
of the full solution before the series starts to diverge.

Figure 2: Values of the perturbative series (24) evaluated to order N

We can use a figure as well
to compare the solutions for variable $\lambda$. Figure 3 shows the non-perturbative
solution and compares it to the perturbative solution for orders one to twelve.
We can see that at some point all approximations given by the perturbative solution
will disagree strongly with the full solution!
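The behavior described above can be reproduced with a few lines of code. This is a sketch under stated assumptions: the helper name `partial_sums` is made up here, the terms are generated exactly from the ratio $c_{k+1}/c_k = (4k+1)(4k+3)/(4(k+1))$ that follows from the coefficients in (24), and the maximal order 40 is an arbitrary illustrative choice.

```python
import math
from fractions import Fraction

def partial_sums(lam, n_max):
    """Partial sums Z_N of the series (24) for N = 0..n_max, plus term sizes.

    The k-th term (-lam)^k c_k with c_k = (4k)!/(2^(4k) (2k)! k!) is built
    exactly with Fractions from c_{k+1}/c_k = (4k+1)(4k+3) / (4(k+1)).
    Returns (list of Z_N as floats, list of |term_k| as floats).
    """
    lam = Fraction(lam)
    term = Fraction(1)   # k = 0 term
    total = Fraction(0)
    sums, sizes = [], []
    for k in range(n_max + 1):
        total += term
        sums.append(math.sqrt(math.pi) * float(total))
        sizes.append(abs(float(term)))
        term *= -lam * Fraction((4 * k + 1) * (4 * k + 3), 4 * (k + 1))
    return sums, sizes

if __name__ == "__main__":
    sums, sizes = partial_sums(Fraction(1, 50), 40)
    best = min(range(len(sizes)), key=sizes.__getitem__)
    print("Z_5  =", sums[5])
    print("Z_10 =", sums[10])
    print("order of the smallest term:", best)
    print("Z_40 =", sums[40])  # the truncated sum has long since run away
```

Truncating near the smallest term is the usual rule of thumb for an asymptotic series; beyond that order every additional term makes the approximation worse.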

The question that arises is: for how long does the perturbative solution become
better before it starts to diverge? Obviously, the fact that it approximates the
non-perturbative solution to high precision leads to the great success of quantum
field theory, even if the series diverges at higher orders! As we will see now, the
perturbative solution (24) is a good approximation as long as we only consider
terms up to order $N = O(\frac{1}{\lambda})$. Remembering that the dimensionless coupling strength of
quantum electrodynamics is the Sommerfeld fine-structure constant $\alpha \approx \frac{1}{137}$, we
can be assured that perturbation theory will lead to great precision, given that the
most elaborate QED calculations for $(g - 2)$ are to order $N = 7$!

Figure 3: $Z(\lambda)$ obtained from the full solution (thick) and first approximations from the
perturbative series (horizontal axis: $\lambda$ from 0.00 to 0.30)



2.3.3 The method of steepest descent

But what is the origin of the eventual divergence and complete loss of numerical
accuracy? It turns out that there are "non-perturbative" terms that do not show up
in a Taylor expansion but that become dominant when the perturbative expansion
breaks down. To see this, let us substitute $x^2 = \frac{u^2}{\lambda}$ in equation (16):

$$Z(\lambda) = \frac{1}{\sqrt{\lambda}} \int du\, e^{-\frac{u^2 + u^4}{\lambda}} \qquad (30)$$

The exponent is strictly negative and its absolute value becomes very large in the
limit of small $\lambda$. This allows us to apply the method of steepest descent: the main
contribution to the integral, as $\lambda \to 0$, comes from the extrema of the integrand

$$e^{-\frac{u^2 + u^4}{\lambda}}. \qquad (31)$$

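In general the method works as follows: for $\Lambda \to \infty$ we want to solve an integral of
the general form

$$\int dx\, A(x)\, e^{\Lambda \phi(x)} . \qquad (32)$$

One now expands $\phi$ around its extrema⁴ $\phi'(x_0) = 0$ and obtains again an integral of
"Gaussian times polynomial" type⁵:

$$\begin{aligned} &= \sum_{x_0:\, \phi'(x_0) = 0} \int dx\, \left(A(x_0) + (x - x_0)\, A'(x_0) + \ldots\right) e^{\Lambda \left(\phi(x_0) + \frac{1}{2} (x - x_0)^2 \phi''(x_0) + \cdots\right)} \\ &= \sum_{x_0:\, \phi'(x_0) = 0} e^{\Lambda \phi(x_0)} \sqrt{\frac{2\pi}{-\Lambda\, \phi''(x_0)}}\; A(x_0)\, (1 + o(1)). \end{aligned} \qquad (33)$$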
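The leading-order formula of the steepest descent method can be tested on a simple example. The sketch below uses an illustrative choice that is not from the notes: $A(x) = 1$ and $\phi(x) = \cos x - 1$ on $[-\pi, \pi]$, which has a single maximum at $x_0 = 0$ with $\phi(x_0) = 0$ and $\phi''(x_0) = -1$, so the leading term is $\sqrt{2\pi/\Lambda}$. The function names and the step count are likewise assumptions.

```python
import math

def laplace_leading_order(big_lambda):
    """Leading term exp(L*phi(x0)) * sqrt(2*pi / (-L*phi''(x0))) * A(x0).

    For the illustrative choice A = 1, phi(x) = cos(x) - 1 this is sqrt(2*pi/L).
    """
    return math.sqrt(2.0 * math.pi / big_lambda)

def laplace_numeric(big_lambda, steps=20000):
    """Integral of exp(L*(cos(x) - 1)) over [-pi, pi] by the composite Simpson rule."""
    a, b = -math.pi, math.pi
    h = (b - a) / steps

    def integrand(x):
        return math.exp(big_lambda * (math.cos(x) - 1.0))

    total = integrand(a) + integrand(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(a + i * h)
    return total * h / 3.0

if __name__ == "__main__":
    for big_lambda in (10.0, 100.0, 1000.0):
        ratio = laplace_numeric(big_lambda) / laplace_leading_order(big_lambda)
        print(big_lambda, ratio)  # the ratio approaches 1 as the parameter grows
```

The ratio differs from 1 only by corrections that shrink with the large parameter, which is the content of the factor $(1 + o(1))$.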



4 Notice that in field theory $\phi'(x) = 0$ is the equation of motion.
5 Corrections from $(x - x_0) A'(x_0)$ can be obtained by again doing the trick of smuggling an $a$ into the
exponent and writing the term as a derivative with respect to $a$ evaluated at $a = 1$.

In our case, the extrema of (31) are $u = 0$ and $u = \pm i/\sqrt{2}$. Expansion around the
first yields the perturbative expansion from above. The other two yield contributions
like $e^{-\frac{1}{4\lambda}}$ that are invisible to a Taylor expansion around $\lambda = 0$, as all derivatives
vanish there. We have found an example of a "non-perturbative contribution".
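The location of the extrema and the size of the exponent there are easy to verify. The following sketch (names are illustrative assumptions) writes the integrand of (30) as $e^{-S(u)/\lambda}$ with $S(u) = u^2 + u^4$ and evaluates $S$ and $S'$ at the three stationary points.

```python
import cmath

def exponent(u):
    """S(u) = u^2 + u^4, so that the integrand of (30) is exp(-S(u)/lambda)."""
    return u ** 2 + u ** 4

def exponent_derivative(u):
    """S'(u) = 2u + 4u^3 = 2u (1 + 2u^2)."""
    return 2 * u + 4 * u ** 3

# Zeros of S'(u): u = 0 and the two roots of 1 + 2u^2 = 0, i.e. u = +-i/sqrt(2).
saddles = [0j, cmath.sqrt(-0.5), -cmath.sqrt(-0.5)]

if __name__ == "__main__":
    for u in saddles:
        print(u, exponent_derivative(u), exponent(u))
```

At the two complex stationary points $S$ has modulus $1/4$, which is what sets the scale $1/(4\lambda)$ in the exponential of the non-perturbative terms.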

The perturbative solution, however, gives meaningful results as long as its terms
are bigger than the non-perturbative contributions. This allows an estimate of
the order in the perturbative series up to which the expansion around $\lambda = 0$ dominates. This
also happens to be the order at which the deviation from the exact solution starts,
as we are missing the non-perturbative terms:

$$e^{-\frac{1}{4\lambda}} \approx \lambda^k \quad \Rightarrow \quad k = O\!\left(\frac{1}{\lambda}\right) \qquad (34)$$

We have seen that the perturbative analysis of (30) requires expansions
around the other extrema as well, besides the one at $u = 0$! Combining all power series,
the resulting perturbative solution has a chance to converge. Before we continue with
the mathematical discussion of how finite results can be obtained from
divergent series, we will take a look at examples of non-perturbative contributions
in physics.



2.3.4 Instantons

"Field configurations" contributing such non-perturbative terms are called "instantons". Usually these
contributions are hard to calculate; in some situations, however, one can find the
result. Consider for example a gauge theory⁶

$$S = \int \mathcal{L} = \frac{1}{g^2} \int \mathrm{tr}(F \wedge {*F}) \qquad (35)$$

The stationary point we use for the expansion is given by the equations of motion

$$dF = 0 \qquad (36a)$$
$$d{*F} = 0 \qquad (36b)$$

The first equation (36a) is automatically fulfilled once we express the field strength
in terms of a vector potential, $F = dA$. The second, (36b), is automatically solved if
it happens that

$$F = {*F}. \qquad (37)$$

One calls solutions to (37) instantons. In terms of the vector potential $A$, (37) is a
first-order partial differential equation, as compared to (36b), which is second order.
One can easily see that there exists no solution in Lorentzian metric, as the Hodge
star squares to $-1$ on 2-forms:

$$F = {*F} = {**F} = -F. \qquad (38)$$



6 Hodge $*$ operator: $({*F})_{\mu\nu} = \frac{1}{2} \epsilon_{\mu\nu\sigma\tau} F^{\sigma\tau}$. More details can be found in chapters 1.10 and 10.5 of [3].

In Euclidean metric, however, such solutions exist because of ${**F} = F$. As it
turns out (as one can for example argue using the Atiyah-Singer index theorem), for
a compact manifold $M$, the action in the instanton case yields an integer (up to a
pre-factor):

$$\int_M \mathrm{tr}(F \wedge F) \in 8\pi^2\, \mathbb{Z} \qquad (39)$$

This leads to

$$e^{-\frac{1}{g^2} \int \mathrm{tr}(F \wedge {*F})} = e^{-\frac{1}{g^2} \int \mathrm{tr}(F \wedge F)} = e^{-\frac{8\pi^2}{g^2} N} \qquad (40)$$
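The sign of ${**}$ on 2-forms, which decides whether self-dual solutions can exist, can be checked by brute force. The sketch below is an illustration under assumptions: it uses the component formula $({*F})_{\mu\nu} = \frac{1}{2}\epsilon_{\mu\nu\rho\sigma}F^{\rho\sigma}$ with the Levi-Civita symbol, a flat diagonal metric with entries $\pm 1$, and an arbitrary antisymmetric example matrix. The function names are made up for this example.

```python
from itertools import permutations

def levi_civita(indices):
    """Sign of the permutation `indices` of (0, 1, 2, 3); 0 if an index repeats."""
    if len(set(indices)) < 4:
        return 0
    sign, idx = 1, list(indices)
    for i in range(4):
        while idx[i] != i:  # sort by transpositions, flipping the sign each time
            j = idx[i]
            idx[i], idx[j] = idx[j], idx[i]
            sign = -sign
    return sign

def hodge_star(F, metric_diag):
    """(*F)_{mu nu} = 1/2 eps_{mu nu rho sigma} F^{rho sigma} for a diagonal +-1 metric."""
    star = [[0.0] * 4 for _ in range(4)]
    for mu, nu, rho, sigma in permutations(range(4)):
        # raise both indices of F with the inverse metric (diagonal, entries +-1)
        F_up = metric_diag[rho] * metric_diag[sigma] * F[rho][sigma]
        star[mu][nu] += 0.5 * levi_civita((mu, nu, rho, sigma)) * F_up
    return star

if __name__ == "__main__":
    F = [[0, 1, 2, 3], [-1, 0, 4, 5], [-2, -4, 0, 6], [-3, -5, -6, 0]]
    for name, metric in (("Euclidean", [1, 1, 1, 1]), ("Lorentzian", [-1, 1, 1, 1])):
        print(name, hodge_star(hodge_star(F, metric), metric))
```

In the Euclidean case applying the star twice returns $F$, so $F + {*F}$ is self-dual for any $F$; with the Lorentzian metric the same computation returns $-F$.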



2.3.5 Dual theories 



Sometimes a quantum field theory with coupling constant $\lambda$ can be rewritten in terms
of another (or possibly the same) quantum field theory with coupling $\tilde\lambda = 1/\lambda$. One
calls such a relation between two theories a "duality". In many examples, such
theories arise from string theory constructions, where the coupling $\lambda$ can be given
a geometric meaning. Imagine for example a quantum field theory on
a torus. A torus can be viewed as $\mathbb{C}/(\mathbb{Z} + \tau\mathbb{Z})$ with $\tau \in \mathbb{C} \setminus \mathbb{R}$. The torus has a
basis of two non-contractible circles, one that goes along the real axis from $0$ to $1$
and one that goes from $0$ to $\tau$. This choice of basis, however, is not unique: for
example, swapping these cycles corresponds to the substitution $\tau \to -1/\tau$. If the
torus parameter $\tau$ is identified with the coupling strength, a duality has been found,
since both $\tau$ and $-1/\tau$ describe geometrically the same torus! To make contact with
our discussion above, we should identify $\lambda$ with the imaginary part of $\tau$. The duality
allows for a Taylor expansion of the non-perturbative contributions via

$$e^{-\frac{1}{\lambda}} = e^{-\tilde\lambda} = \sum_{k} \frac{(-\tilde\lambda)^k}{k!}. \qquad (41)$$

Troublesome terms in one theory are therefore perfectly well defined in the dual theory.
The caveat, however, is the difficulty of actually proving that $\lambda \to 1/\lambda$ is a symmetry
of the quantum field theory at hand.



2.4 Asymptotic series and Borel summation 

In the following, we take a look at the mathematical situation of asymptotic series.
This discussion is based on chapter XII of [3].

Definition 1 Let $f : \mathbb{R}_{>0} \to \mathbb{C}$. The series $\sum_{n=0}^{\infty} a_n z^n$ is called asymptotic to $f$ as
$z \searrow 0$ if

$$\forall N \in \mathbb{N}: \quad \lim_{z \searrow 0} \frac{f(z) - \sum_{n=0}^{N} a_n z^n}{z^N} = 0. \qquad (42)$$

For $z \in \mathbb{C}$ an analogous definition is possible.

Obviously, every function can have at most one asymptotic expansion. This can be
seen by assuming two asymptotic expansions $a_n$ and $\tilde a_n$. (42) requires that $a_n = \tilde a_n$.
Otherwise, let $n$ be the smallest index for which $a_n \neq \tilde a_n$; then

$$\lim_{z \searrow 0} \frac{\sum_{k=0}^{n} (a_k - \tilde a_k)\, z^k}{z^n} = a_n - \tilde a_n = 0, \qquad (43)$$

which is a contradiction.

The other way around is not true, as can be seen from $f(z) = e^{-1/z}$ and $f(z) = 0$, both having
the asymptotic series $\sum_n 0 \cdot z^n$. This means that knowing the asymptotic series
of a function tells us nothing about $f(z)$ for a non-vanishing $z$; we only know how
$f(z)$ approaches $f(0)$ as $z \searrow 0$.
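The statement that $e^{-1/z}$ shares the vanishing asymptotic series with the zero function can be made tangible numerically. The sketch below evaluates the quotient from the definition of an asymptotic series for all coefficients set to zero; the helper name `asymptotic_ratio`, the order $N = 5$ and the sample points are illustrative assumptions.

```python
import math

def asymptotic_ratio(f, coefficients, z, N):
    """(f(z) - sum_{n<=N} a_n z^n) / z^N, the quotient whose limit z -> 0+ must vanish."""
    partial = sum(a * z ** n for n, a in enumerate(coefficients[: N + 1]))
    return (f(z) - partial) / z ** N

def f(z):
    """A function that is nonzero for z > 0 but flatter than any power at z = 0."""
    return math.exp(-1.0 / z)

zero_series = [0.0] * 6  # a_n = 0 for n = 0..5

if __name__ == "__main__":
    for z in (0.1, 0.05, 0.02, 0.01):
        print(z, f(z), asymptotic_ratio(f, zero_series, z, N=5))
```

The quotient drops rapidly towards zero although $f(z)$ itself is strictly positive, so the series alone cannot distinguish $f$ from the zero function.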

We try to find a stronger definition of an asymptotic series, allowing us to
uniquely recover one function. The following theorem helps us to find the necessary
condition:

Theorem 2 (Carleman's theorem) Let $g$ be an analytic function in the interior
of $S = \{z \in \mathbb{C} \mid |z| \le B,\ |\arg z| \le \frac{\pi}{2}\}$ and continuous on $S$. If for all $n \in \mathbb{N}$ and $z \in S$
we have $|g(z)| < b_n |z|^n$ and $\sum_n b_n^{-1/n} = \infty$, then $g$ is identically zero.

A simpler special case of the theorem is found by considering $g$ an analytic function
in the interior of $S_\epsilon = \{z \in \mathbb{C} \mid |z| \le R,\ |\arg z| \le \frac{\pi}{2} + \epsilon\}$ for some $\epsilon > 0$ and
continuous on $S_\epsilon$. If there exist $C$ and $B$ so that $|g(z)| < C B^n n!\, |z|^n$ $\forall z \in S_\epsilon$ and $\forall n$,
then $g$ is identically zero.

In order to find a unique function for an asymptotic series, we use Carleman's
theorem to define "strong asymptotic series".

Definition 3 Let $f : S_\epsilon \to \mathbb{C}$ be an analytic function on the interior of $S_\epsilon = \{z \in \mathbb{C} \mid |z| \le R,\ |\arg z| \le \frac{\pi}{2} + \epsilon\}$. The series $\sum_{n=0}^{\infty} a_n z^n$ is a strong asymptotic series if there
exist $C, \sigma$ so that $\forall N \in \mathbb{N}$, $z \in S_\epsilon$ the strong asymptotic condition

$$\left| f(z) - \sum_{n=0}^{N} a_n z^n \right| \le C\, \sigma^{N+1}\, (N+1)!\, |z|^{N+1} \qquad (44)$$

is fulfilled.

This means that, given a strong asymptotic series, we can recover the function by theorem
2! Assume for example that $\sum_n a_n z^n$ is a strong asymptotic series for two
functions $f$ and $g$, respectively. Then

$$|f(z) - g(z)| \le 2 C \sigma^{N+1} (N+1)!\, |z|^{N+1} \quad \Rightarrow \quad f = g \qquad (45)$$



The strong asymptotic condition (44) implies $|a_n| \le C \sigma^n n!$. This is precisely the
growth behavior of (24) we found in our toy example, where $C = \frac{1}{\sqrt{2\pi}}$ and $\sigma = 4$. The
necessary conditions, therefore, are fulfilled in our toy model (assuming analyticity
away from $0$ of course).

By now, we have learned that a strong asymptotic series (in particular the type we
obtain in quantum field theory), although not converging, has the chance to be a
unique approximation to one function. The final question is how one can obtain this
function $f$ from its strong asymptotic series. In the last theorem we introduce the
method of "Borel summation" to obtain a final result. We can define a convergent
series by taking out a factor of $n!$ from the coefficients:

Theorem 4 (Watson's theorem) If $f : S_\epsilon \to \mathbb{C}$ has a strong asymptotic series
$\sum_{n=0}^{\infty} a_n z^n$, we define the Borel transform

$$g(z) = \sum_{n=0}^{\infty} \frac{a_n}{n!}\, z^n . \qquad (46)$$

The Borel transform converges for $|z| < \frac{1}{\sigma}$. We obtained a convergent power series
with finite radius of convergence, which, as it turns out, can be analytically continued
to all complex $z \in \mathbb{C}$ with $|\arg z| < \epsilon$. Then the function $f$ is given by the Laplace
transform

$$f(z) = \int_0^{\infty} db\; g(bz)\, e^{-b} . \qquad (47)$$

This Laplace transform is called "inverse Borel transform" and the method outlined
here is known as the "Borel summability method". It describes how to obtain a finite
answer from a divergent series, that is, formally a sum for the series.

Let us make a sanity check. Using $\int_0^{\infty} dx\, x^k e^{-x} = k!$ we can plug the definition
of the Borel transform into (47), formally interchange the sum and the integration,
and obtain

$$f(z) = \int_0^{\infty} db\; g(bz)\, e^{-b} = \int_0^{\infty} db \sum_n \frac{a_n}{n!}\, (bz)^n\, e^{-b} = \sum_n a_n z^n . \qquad (48)$$

So at least for analytic functions we do recover the original function.
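To see Borel summation at work on an actually divergent series, here is a minimal sketch using an example that is not from these notes: the series $\sum_n (-1)^n n!\, z^n$, whose Borel transform $\sum_n (-t)^n = 1/(1+t)$ is known in closed form and therefore trivially continued along the positive real axis. The function names and quadrature parameters are assumptions.

```python
import math

def partial_sum(z, N):
    """Sum_{n=0}^{N} (-1)^n n! z^n, a power series with zero radius of convergence."""
    return sum((-1) ** n * math.factorial(n) * z ** n for n in range(N + 1))

def borel_transform(t):
    """g(t) = sum_n (-t)^n = 1/(1+t); the closed form continues g beyond |t| < 1."""
    return 1.0 / (1.0 + t)

def borel_sum(z, b_max=60.0, steps=60000):
    """f(z) = int_0^infinity db g(b z) exp(-b), by Simpson's rule on [0, b_max].

    `b_max` and `steps` (even) are illustrative; exp(-60) makes the cut tail negligible.
    """
    h = b_max / steps

    def integrand(b):
        return borel_transform(b * z) * math.exp(-b)

    total = integrand(0.0) + integrand(b_max)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0

if __name__ == "__main__":
    z = 0.1
    print("Borel sum          :", borel_sum(z))
    print("partial sum, N = 9 :", partial_sum(z, 9))
    print("partial sum, N = 40:", partial_sum(z, 40))
```

The partial sum truncated near its smallest term lands close to the Borel sum, while the high-order partial sum is useless. Note that this only works so directly because the analytic continuation of the Borel transform is known here.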
2.5 Summary 

We have learned why for $N < O(\frac{1}{\lambda})$ the sum of the first $N$ terms of the perturbation
expansion is numerically good, even when the original series $\sum_n a_n z^n = \infty$ diverges.
This way we approximate the true function up to instantonic terms of the order of
$e^{-1/\lambda}$, which a Taylor expansion cannot resolve. Given that the coefficients $a_n$ obey
the strong asymptotic condition $|a_n| \le C \sigma^n n!$, which is usually the case when
using Feynman diagrams, the Borel transform exists and one can compute the Borel
summation. Unfortunately this is more a theoretical assurance that perturbation
theory can be given a mathematical meaning even though it does not converge, since
in order to really compute the integral (47) one has to know the analytic continuation
of $g$, which requires knowledge of all coefficients $a_n$ and not just the first $N$.



3 Regularization and renormalization as extensions of distributions

In the previous section, we learned how to make sense of (some) divergent series
of the form $\sum_{k=0}^{\infty} a_k = \infty$. In QFT the factors $a_k$ are typically complicated
mathematical expressions described by Feynman diagrams, and generically, these
expressions diverge themselves, creating a need for renormalization techniques.
A typical example of a divergent diagram (in 4 dimensions) is shown in Figure 4.
The term described by this diagram reads (without unimportant factors):

$$\int d^4k\; \frac{1}{k^2 + m^2}\; \frac{1}{(p_1 + p_2 + k)^2 + m^2} \approx \int \frac{k^3\, dk}{k^4}$$

where $m$ denotes the mass of the scalar particles we are scattering. This integral
obviously diverges logarithmically for $k \to \infty$, as shown above. The most straightforward
approach to this problem is to introduce a cut-off energy scale $\Lambda$, such that
the divergence at the upper boundary becomes

$$\int^{\Lambda} \frac{k^3\, dk}{k^4} \approx \log \Lambda .$$

Figure 4: Divergent 1-loop diagram

Usually, such a blunt cut-off regularization is incompatible with the symmetries of the
theory at hand and is thus only useful to estimate "how divergent" a diagram is (a
notion we will formalize below as the "singular degree") and has to be replaced by
more sophisticated methods like dimensional or Pauli-Villars regularization in more
practical applications.
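The logarithmic growth with the cut-off can be made explicit in a simplified setting. The sketch below assumes vanishing external momentum, $p_1 + p_2 = 0$ (an assumption made here for illustration), so that the radial integral becomes $\int_0^\Lambda k^3\, dk / (k^2 + m^2)^2$, which has the closed form $\frac{1}{2}\left[\ln(1 + \Lambda^2/m^2) + \frac{1}{1 + \Lambda^2/m^2} - 1\right]$. The function names are illustrative.

```python
import math

def radial_loop_integral(cutoff, m=1.0):
    """Closed form of int_0^cutoff k^3 dk / (k^2 + m^2)^2 (zero external momentum)."""
    r = (cutoff / m) ** 2
    return 0.5 * (math.log(1.0 + r) + 1.0 / (1.0 + r) - 1.0)

def radial_loop_integral_numeric(cutoff, m=1.0, steps=20000):
    """The same integral by the composite Simpson rule, as a cross-check."""
    h = cutoff / steps

    def integrand(k):
        return k ** 3 / (k * k + m * m) ** 2

    total = integrand(0.0) + integrand(cutoff)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0

if __name__ == "__main__":
    for cutoff in (1e1, 1e2, 1e3, 1e4):
        print(cutoff, radial_loop_integral(cutoff))
    # each factor of 10 in the cut-off adds roughly log(10) to the result
    print(radial_loop_integral(1e4) - radial_loop_integral(1e3), math.log(10.0))
```

Raising the cut-off by a fixed factor adds a fixed amount, which is the signature of a logarithmic divergence.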

In these notes, instead of momentum representation, we will work in position
space, where instead of loop momenta one integrates over the positions of the interaction
vertices.

What was $1/(k^2 + m^2)$ is now the propagator $G$ defined by the equation

$$(\Box + m^2)\, G(x) = \delta(x). \qquad (49)$$

Figure 5: Divergent 1-loop diagram in position-space language (two external legs $\phi(x)$ at one vertex, two external legs $\phi(y)$ at the other)

We can compute the same diagram in position-space language, where it reads
(see Figure 5):

$$\int d^4x \int d^4y\; \phi^2(x)\, G^2(x - y)\, \phi^2(y) = \int d^4x \int d^4u\; \phi^2(x)\, G^2(u)\, \phi^2(x - u) \qquad (50)$$



The approach of "causal perturbation theory" or "Epstein-Glaser regularization"
is to take seriously the fact that the propagator is really a distribution and that, in
the above expression, we are trying to multiply distributions, which in general is
undefined. This approach is advocated in the book by Scharf [5]. Here, we will
follow (a simplified, flat-space version of) [5] and in particular [7].

Specifically, in the defining equation (49), $\delta$ is not a function but a distribution
(physicists writing $\delta(x)$ are trying to imply that this is the kernel of the distribution
$\delta$, i.e. that $\delta$ arises by multiplying the test function by a function $\delta(x)$ and then
integrating over $x$; such a function of course does not exist). Thus, we should interpret $G(x)$ as
a distribution as well (a priori it is only a weak solution of the differential equation
(49)). But as it is in general not possible to multiply distributions, as we will see
later, we do not have a naive way to obtain "$G^2$" as a distribution. In this chapter,
our goal will be to understand renormalization techniques in terms of distributions.
Our route will be led by the question of how to define the product of two distributions
that are almost everywhere functions (which can be multiplied). We will therefore first
recapitulate what distributions actually are. Then, we will see in which
cases it is possible to multiply distributions and in which it is not. This will lead us
to the renormalization techniques we are searching for.



3.1 Recapitulation of distributions 

Distributions are generalized functions. As in many other cases of generalization,
this is done via dualization: Starting from an ordinary function $f$ (in our case
locally integrable, that is $f: \mathbb{R}^n \to \mathbb{C}$ with $\int_K |f| < \infty$ for each compact $K \subset \mathbb{R}^n$,
so that "divergence of the integral $\int |f|$ at infinity" is tolerated) one can view it as
a linear functional $T_f$ (called a "regular distribution") on the functions of compact
support via

$$T_f : \phi \mapsto \int f \phi. \qquad (51)$$

As the map $f \mapsto T_f$ is injective (up to changes of $f$ on sets of measure zero) we can use
the $T_f$'s to distinguish the different $f$'s and view $T_f$ in place of $f$. This suggests
generalizing the construction to all linear functionals $T : \phi \mapsto T(\phi)$, called
distributions, of which the regular ones arising from functions $f$ as above are a subset.

Specifically, distributions are defined to be linear and continuous functionals on
the space of test functions $\mathcal{D}(\mathbb{R}^n) = C_0^\infty(\mathbb{R}^n)$ (the subscript meaning compact
support) equipped with an appropriate topology that will not concern us here. So
formally, the distributions are the elements of the space

$$\mathcal{D}'(\mathbb{R}^n) = \{T : \mathcal{D}(\mathbb{R}^n) \to \mathbb{C} \mid T \text{ is linear and continuous}\}.$$

Besides the regular distributions $T_f$ encountered above (of which the function $f$ is
called the kernel) the typical example of a singular distribution is the $\delta$-distribution:
if we take a given test function $\phi(x) \in \mathcal{D}$, then the $\delta$-distribution is defined to be
the functional $\delta[\phi] = \phi(0)$. This distribution is not regular even though physicists
pretend it to be, with a kernel $\delta(x)$ that is so singular at $x = 0$ that $\int \delta(x) = 1$ even
though it vanishes for all $x \neq 0$.

Later, we will make use of the fact that distributions can be differentiated. Using
integration by parts in the integral representation of a regular distribution, we easily
obtain $T_{f'}[\phi] = -T_f[\phi']$, which enables us to define the derivative of a distribution
to be $T'[\phi] = -T[\phi']$. Thus we can take the derivative of a regular distribution $T_f$
even if the kernel $f$ is not differentiable.
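The definitions above can be tried out numerically. The following is only a minimal sketch: the helper names (`bump`, `regular`, `distributional_derivative`) are made up for this illustration, integrals are approximated by a midpoint rule and $\phi'$ by a finite difference. It checks the textbook example that the distributional derivative of the step function is the $\delta$-distribution, although the step function is not differentiable at the origin.

```python
import math

def bump(x):
    """Smooth test function supported in (-1, 1)."""
    d = 1.0 - x * x
    return math.exp(-1.0 / d) if d > 0.0 else 0.0

def derivative(phi, h=1e-5):
    """Central finite-difference approximation of phi' (illustration only)."""
    return lambda x: (phi(x + h) - phi(x - h)) / (2.0 * h)

def regular(f, a=-1.0, b=1.0, n=20000):
    """Regular distribution T_f[phi] = int_a^b f(x) phi(x) dx (midpoint rule)."""
    step = (b - a) / n
    def T(phi):
        total = 0.0
        for k in range(n):
            x = a + (k + 0.5) * step
            total += f(x) * phi(x)
        return step * total
    return T

def delta(phi):
    """Singular distribution delta[phi] = phi(0)."""
    return phi(0.0)

def distributional_derivative(T):
    """T'[phi] = -T[phi']."""
    return lambda phi: -T(derivative(phi))

def heaviside(x):
    return 1.0 if x > 0.0 else 0.0

# Derivative of the regular distribution with step-function kernel,
# evaluated on the bump, compared with delta evaluated on the bump.
lhs = distributional_derivative(regular(heaviside))(bump)
rhs = delta(bump)
print(lhs, rhs)
```

Both numbers should agree up to the discretization error of the sketch, since $-\int_0^\infty \phi'(x)\,dx = \phi(0)$.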

The only operation defined on functions that does not directly carry over to
distributions is (pointwise) multiplication $(f \cdot g)(x) = f(x)g(x)$. Already $L^1_{loc}$ is not
closed under multiplication (recall that in order for a function to be in $L^1_{loc}(\mathbb{R})$ it must
not have singularities that go like $1/|x|^a$ with $a \geq 1$, a property not stable under
multiplication) and in general the product of distributions is not defined. Of course,
as long as with $f$ and $g$ also $f \cdot g \in L^1_{loc}$ we still have the regular distribution $T_{f \cdot g}$
and, from a technical perspective, in this section we will deal with the problem
of extending a distribution that can be written as $T_{f \cdot g}$ for a subset of test functions
(those that vanish where $f \cdot g$ is too singular to be in $L^1_{loc}$) to all test functions.

To this end, for any distribution $T \in \mathcal{D}'$ we define the singular support of $T$
($\mathrm{singsupp}(T)$) as the smallest closed set in $\mathbb{R}^n$ such that there exists a function
$f \in L^1_{loc}$ with $T[\phi] = T_f[\phi]$ for all $\phi \in \mathcal{D}$ with $\mathrm{supp}\,\phi \cap \mathrm{singsupp}(T) = \emptyset$. For
example $\mathrm{singsupp}(\delta) = \{0\}$, and the corresponding function $f \in L^1_{loc}$ is simply
$f(x) = 0$. So the idea behind this definition is that every distribution can be
written as a regular distribution as long as it is only applied to test functions which
vanish in a neighbourhood of the distribution's singular support, which enables us
to multiply distributions if we manage to take care of the singular support.



3.2 Definition of $G^2$ in $\mathcal{D}'(\mathbb{R}^4 \setminus \{0\})$

Coming back to our concrete field-theoretic problem for a moment, we now have to
answer one important question: what is the singular support of $G$? Here, we can
utilize the fact that $\Box + m^2$ is an elliptic operator (footnote 7), and it can be shown that if
two distributions $T$ and $S$ are related by $\sigma T = S$ with an elliptic operator $\sigma$, then
$\mathrm{singsupp}(T) \subset \mathrm{singsupp}(S)$, so we immediately see $\mathrm{singsupp}(G) \subset \{0\}$.

It is clear that the singularity of $G(x)$ at $x = 0$ corresponds to the divergence
of the momentum integral at high energies $\Lambda \to \infty$, because in order to probe
small distances, short wavelengths, which correspond to high momenta, are needed;
therefore we speak of UV divergences. In this regime, we can set $m^2 \approx 0$, and so
(49) simplifies to $\Box G(x) = \delta(x) \Rightarrow G(x) \sim \frac{1}{x^2}$ for small $x$ (footnote 8).

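Using this, we see that the position-space integral $\int_{|x| > 1/\Lambda} d^4x\, G^2(x)$ again diverges
as $\log(\Lambda)$. Because of this divergence $G^2(x) \notin L^1_{loc}(\mathbb{R}^4)$, so $G^2$ is still not defined as a
distribution in $\mathcal{D}'(\mathbb{R}^4)$. Nevertheless, we can use $G^2(x)$ as kernel of a distribution
in $\mathcal{D}'(\mathbb{R}^4 \setminus \{0\}) = \{T : \mathcal{D}(\mathbb{R}^4 \setminus \{0\}) \to \mathbb{C} \mid \text{linear and continuous}\}$, where $\mathcal{D}(\mathbb{R}^4 \setminus \{0\})$
is the set of test functions with $0 \notin \mathrm{supp}\,\phi$.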
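The logarithmic divergence can be made explicit with a short numerical sketch. It assumes the massless short-distance forms $G \sim 1/|x|^2$ and $G^2 \sim 1/|x|^4$ quoted above, restricts the integral to the shell $\epsilon < |x| < 1$ with $\epsilon = 1/\Lambda$, and uses the area $2\pi^2$ of the unit sphere in $\mathbb{R}^4$; the function name `shell_integral` is introduced here only for illustration.

```python
import math

AREA_S3 = 2.0 * math.pi ** 2  # surface area of the unit sphere in R^4

def shell_integral(p, eps, n=20000):
    """int_{eps<|x|<1} d^4x |x|^(-p), radial midpoint rule in s = log r."""
    s_min = math.log(eps)
    step = -s_min / n
    total = 0.0
    for k in range(n):
        r = math.exp(s_min + (k + 0.5) * step)
        total += r ** (4 - p)  # r^3 dr * r^(-p) = r^(4-p) ds
    return AREA_S3 * step * total

# p = 2 (kernel ~ G) converges as eps -> 0, p = 4 (kernel ~ G^2) grows like log(1/eps).
for eps in (1e-2, 1e-4, 1e-6):
    print(eps, shell_integral(2, eps), shell_integral(4, eps))
```

The $p = 2$ column approaches $\pi^2$, while the $p = 4$ column increases by $2\pi^2 \log(100)$ each time $\epsilon$ shrinks by a factor of 100, which is the $\log(\Lambda)$ behaviour described above.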

We have now managed to define a distribution $G^2$, but we still have to extend it from
$\mathcal{D}(\mathbb{R}^4 \setminus \{0\})$ to $\mathcal{D}(\mathbb{R}^4)$. Formally, as a linear map, we have to say what values the
extension takes on $\mathcal{D}(\mathbb{R}^4)/\mathcal{D}(\mathbb{R}^4 \setminus \{0\})$, which is still an infinite-dimensional vector
space. To control this infinity, we will use the scaling degree.



3.3 The scaling degree and extensions of distributions 

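Consider a scaling map $\Lambda$ acting on test functions:

$$\Lambda : \mathbb{R}_{>0} \times \mathcal{D}(\mathbb{R}^n) \to \mathcal{D}(\mathbb{R}^n), \qquad (\lambda, \phi) \mapsto \phi_\lambda(x) = \lambda^{-n} \phi(\lambda^{-1} x).$$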

7 Explaining it without going into details, a differential operator which is defined as a polynomial in
$\partial$ (with possibly coordinate-dependent coefficients) is elliptic if its highest-order part is non-zero
when $\partial$ is replaced by any non-zero vector $y$. In our Euclidean examples,
$\Box = \sum_{i=1}^n \partial_i^2 \to |y|^2 > 0$ for any non-zero $y$.

8 The relation $\delta(x) = -\frac{1}{4\pi} \Delta \frac{1}{|x|}$ is well known to hold in 3 dimensions. In general,
$\Box |x|^{2-n} \propto \delta$ in $n$ dimensions.
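The pullback of this map to the space of distributions reads

$$(\Lambda T)[\phi] = T[\phi_\lambda] = T_\lambda[\phi],$$

which for regular distributions gives

$$(T_f)_\lambda[\phi] = \int d^n x\, f(x)\, \lambda^{-n} \phi(\lambda^{-1} x) = \int d^n x\, f(\lambda x)\, \phi(x),$$

so the power of $\lambda$ in the scaling map acting on test functions is chosen such that the
kernel $f$ transforms in a simple manner without prefactor. We now define the scaling
degree (sd) of $T \in \mathcal{D}'(M)$, $M \subset \mathbb{R}^n$:

$$\mathrm{sd}(T) = \inf \{ \omega \in \mathbb{R} \mid \lim_{\lambda \to 0} \lambda^\omega T_\lambda = 0 \}.$$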

To understand this definition, we have to note several properties: 

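• $\mathrm{sd}(T) \in [-\infty, \infty[$

• For regular distributions with a kernel $f$ that is continuous at the origin, $\mathrm{sd}(T_f) \leq 0$

• $\mathrm{sd}(\delta) = n$

• $\mathrm{sd}(\partial^\alpha T) \leq \mathrm{sd}(T) + |\alpha|$ with some multi-index $\alpha$

• $\mathrm{sd}(x^\alpha T) \leq \mathrm{sd}(T) - |\alpha|$ with some multi-index $\alpha$ (footnote 9)

• $\mathrm{sd}(T_1 + T_2) \leq \max\{\mathrm{sd}(T_1), \mathrm{sd}(T_2)\}$, with equality if $\mathrm{sd}(T_1) \neq \mathrm{sd}(T_2)$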
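A few of these properties can be illustrated numerically in $n = 1$. The sketch below assumes the standard definition of the scaling degree as the infimum of all $\omega$ for which $\lambda^\omega T_\lambda \to 0$ as $\lambda \to 0$, and simply estimates the exponent $s$ in $T_\lambda[\phi] \sim \lambda^{-s}$ from two small values of $\lambda$; for the three examples chosen this exponent coincides with the scaling degree. All function names are hypothetical helpers for this illustration.

```python
import math

def bump(x):
    """Smooth test function supported in (-1, 1)."""
    d = 1.0 - x * x
    return math.exp(-1.0 / d) if d > 0.0 else 0.0

def scaled_regular(f, lam, n=4000):
    """(T_f)_lambda[bump] = int f(lam*x) bump(x) dx, midpoint rule on [-1, 1]."""
    step = 2.0 / n
    total = 0.0
    for k in range(n):
        x = -1.0 + (k + 0.5) * step  # midpoints never hit x = 0 for even n
        total += f(lam * x) * bump(x)
    return step * total

def scaled_delta(lam):
    """delta[phi_lambda] with phi_lambda(x) = bump(x / lam) / lam in n = 1."""
    return bump(0.0) / lam

def growth_exponent(value_at, lam1=1e-2, lam2=1e-3):
    """Exponent s such that value_at(lam) ~ lam**(-s) for small lam."""
    return -math.log(value_at(lam1) / value_at(lam2)) / math.log(lam1 / lam2)

sd_delta = growth_exponent(scaled_delta)
sd_cos = growth_exponent(lambda lam: scaled_regular(math.cos, lam))
sd_inv_sqrt = growth_exponent(
    lambda lam: scaled_regular(lambda x: abs(x) ** -0.5, lam))
print(sd_delta, sd_cos, sd_inv_sqrt)
```

The estimates come out close to $1 = n$ for $\delta$, close to $0$ for the continuous kernel $\cos(x)$, and close to $1/2$ for the integrable but singular kernel $|x|^{-1/2}$, which is why the bound $\mathrm{sd}(T_f) \leq 0$ needs a kernel that is well behaved at the origin.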

This leads us to the following important theorem: 

Theorem 5 If $T_0 \in \mathcal{D}'(\mathbb{R}^n \setminus \{0\})$ is a distribution with $\mathrm{sd}(T_0) < n$, then there is a
unique distribution $T \in \mathcal{D}'(\mathbb{R}^n)$ with $\mathrm{sd}(T) = \mathrm{sd}(T_0)$ extending $T_0$.

The proof of uniqueness is quite easy: We assume the existence of two different
solutions $T$ and $\tilde{T}$ extending $T_0$ and derive a contradiction. Obviously
$\mathrm{supp}(T - \tilde{T}) = \{0\}$, and from this it follows that $T - \tilde{T} = P(\partial)\delta$ with some
polynomial $P$. As can be seen from the above properties, $\mathrm{sd}(P(\partial)\delta) \geq n$, and this would
be a contradiction to $\mathrm{sd}(T - \tilde{T}) \leq \max\{\mathrm{sd}(T), \mathrm{sd}(\tilde{T})\} = \mathrm{sd}(T_0) < n$. Existence is shown
constructively using a smooth cut-off function $c_\epsilon(x)$ that is 1 outside a ball of radius
$2\epsilon$ and vanishes in a ball of radius $\epsilon$. Then we can define

$$T[\phi] = \lim_{\epsilon \to 0} T_0[c_\epsilon \phi], \qquad (52)$$

where one still has to show that the above limit exists in the sense of distributions.

The theorem above now enables us to uniquely extend distributions of low scaling
degree to the full space $\mathcal{D}'(\mathbb{R}^n)$, but what about distributions with scaling degree
$\geq n$? We will solve this problem in the next section, and afterwards we will be able
to return to our field-theoretic problem of understanding the nature of $G^2$.

But first we have to determine the scaling degree of the massive propagator
$G$, defined by $\delta = (\Box + m^2)G$. We know that $\mathrm{sd}(\delta) = n$, and therefore
$\mathrm{sd}((\Box + m^2)G) = n$ too. If we denote $\mathrm{sd}(G)$ by $\omega$, from the above items it follows that
$\mathrm{sd}(\Box G) = \omega + 2$, $\mathrm{sd}(m^2 G) = \omega$ and therefore $\mathrm{sd}((\Box + m^2)G) = \omega + 2$. From this it
follows that $\omega = n - 2$ even for the massive propagator.

9 Remember that distributions do not depend on coordinates, only their kernels. Here we used the
definition $(x^\alpha T)[\phi] = T[x^\alpha \phi]$.
3.4 Case of distributions with high scaling degree

Considering now a distribution $T_0 \in \mathcal{D}'(\mathbb{R}^n \setminus \{0\})$ with $\mathrm{sd}(T_0) \geq n$, uniqueness as in
the above theorem does not hold anymore. But if we take a test function $\phi \in \mathcal{D}(\mathbb{R}^n)$
which vanishes to order $\omega = \mathrm{sd}(T_0) - n$ ("singular order") at $x = 0$, i.e. which
can be written as $\phi(x) = \sum_{|\alpha| = \lfloor\omega\rfloor + 1} x^\alpha \phi_\alpha(x)$ where $\phi_\alpha(0)$ is finite and $\lfloor\omega\rfloor$ denotes
the largest integer not bigger than $\omega$, we can define $T[\phi] = \sum_{|\alpha| = \lfloor\omega\rfloor + 1} (x^\alpha T_0)[\phi_\alpha]$.
This is well defined because the distribution $x^\alpha T_0$ has scaling degree less than $n$ and
can thus be uniquely extended.

A general test function can of course be written as a sum of a function vanishing
to order $\omega$ and a polynomial of degree at most $\omega$ by subtracting and adding the
order-$\omega$ Taylor polynomial at $x = 0$:

$$\phi_s(x) = \phi(x) - \sum_{|\alpha| \leq \omega} \frac{x^\alpha}{\alpha!} \partial^\alpha \phi(0). \qquad (53)$$

This procedure of subtracting the terms leading to divergences is the regularization
in this framework. Since the extended distribution $T$ applied to $\phi_s$ is unique,
by linearity we still have to define $T$ only on the monomials in $x$ of maximal degree
$\omega$. There is no further restriction on doing this, and this ambiguity in the extension
$T$ is what one would have expected: Changing the value of $T$ on a monomial $x^\alpha$
corresponds to adding a multiple of $\partial^\alpha \delta$ to $T$.

Note well that the arbitrary values of $T[x^\alpha]$ are exactly those where $T_0[x^\alpha]$
was undefined (divergent in physicists' parlance), and selecting a certain value corresponds
to picking a counter term, a procedure known as renormalization, as formally
infinite values are replaced by finite ones (that have to be fixed by further physical
input like the measurement of the "physical mass" or the physical "coupling constant").
In the following small sections, we will try out this method in a few easy,
concrete examples.

3.4.1 Example in n = 1 

In order to make our steps so far clearer, we are going to apply them to a
simple example in $n = 1$. In fact, this example already shows the full regularization
and renormalization procedure.

As can be easily checked, the function $f(x) = \frac{1}{|x|}$ is not an element of $L^1_{loc}(\mathbb{R})$
because its pole at $x = 0$ is not integrable (it is of course log-divergent), so we cannot
a priori use it as kernel of a distribution $T_f \in \mathcal{D}'(\mathbb{R})$ as we have seen in section
3.1. But $f(x) \in L^1_{loc}(\mathbb{R} \setminus \{0\})$ and $\mathrm{sd}(T_f) = 1 = n$, therefore $\omega = 0$. This means that
for a test function $\phi(x)$ with $\phi(0) = 0$ we can define $T_f[\phi] = \int dx\, \frac{\phi(x)}{|x|}$, which gives a
finite result: Using l'Hôpital's rule, we see $\lim_{x \to \pm 0} \frac{\phi(x)}{|x|} = \lim_{x \to \pm 0} \frac{\phi'(x)}{\mathrm{sign}(x)} = \text{finite}$
and thus the integrand is finite everywhere. This is similar to what we have done
in sections 3.2 and 3.3. For other test functions, we can again (as in this section
above) define $\phi_s(x) = \phi(x) - \phi(0)$. Afterwards, we write the general extension for a
distribution acting on $\phi$ as $T_f[\phi] = T_f[\phi_s] + c\phi(0)$ with one arbitrary constant $c$ of
our choice.
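The effect of the subtraction can be seen in a small numerical sketch. It is an illustration under assumptions, not part of the construction: the test function is a particular even bump supported in $(-1, 1)$, the origin is cut out by a small $\epsilon$, and the integral is restricted to $|x| < 1$ so that only the behaviour at small $x$ is probed. The names `shell_integral` and `extension` are introduced here for illustration.

```python
import math

def bump(x):
    """Even smooth test function supported in (-1, 1)."""
    d = 1.0 - x * x
    return math.exp(-1.0 / d) if d > 0.0 else 0.0

def shell_integral(g, eps, n=20000):
    """int_{eps<|x|<1} g(x)/|x| dx for an even function g, midpoint rule in s = log x."""
    s_min = math.log(eps)
    step = -s_min / n
    total = 0.0
    for k in range(n):
        total += g(math.exp(s_min + (k + 0.5) * step))
    return 2.0 * step * total  # factor 2: both signs of x

phi0 = bump(0.0)

def subtracted(x):
    """phi_s(x) = phi(x) - phi(0) for phi = bump."""
    return bump(x) - phi0

# Unsubtracted integral grows like 2*phi(0)*log(1/eps); subtracted one settles.
for eps in (1e-3, 1e-6):
    print(eps, shell_integral(bump, eps), shell_integral(subtracted, eps))

def extension(c, eps=1e-6):
    """T_f[phi_s] + c*phi(0) for phi = bump, with the integral cut at |x| < 1 (assumption)."""
    return shell_integral(subtracted, eps) + c * phi0
```

Different choices of `c` shift the result by `c * phi0`, which is the one-parameter ambiguity of the extension.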

The careful reader will have realised that there is still a problem, as $\phi_s$ fails to
have compact support when $\phi(0) \neq 0$ and thus the integration now diverges at
the boundary $x \to \pm\infty$. We will deal with this problem below, but the important
observation is that the divergence in the ultraviolet, that is at small $x$, is cured.
3.4.2 Example in n = 4



In our field-theoretic problem (50) from above, we have $G^2 \sim \frac{1}{x^4}$ in $\mathbb{R}^4$, which is
quite similar to the previous example, as it is the kernel of a distribution
$T_{G^2} = T_{1/x^4} \in \mathcal{D}'(\mathbb{R}^4 \setminus \{0\})$. Again, we are looking for an extension. Once more, we have
$\mathrm{sd}(G^2) = 4 = n$. Regularization and renormalization are as in the example above
and yield

$$T_{1/x^4}[\phi] = \int d^4x\, \frac{\phi(x) - \phi(0)}{x^4} + c\,\delta[\phi] \qquad (54)$$

with $T_{1/x^4} \in \mathcal{D}'(\mathbb{R}^4)$ and arbitrary $c$. Again, we successfully got rid of the problems
at $x = 0$ (at the cost of introducing one constant $c$).

This concludes our calculation of the fish diagram Fig. 4 that computes a contribution
of the form $\phi(x)^2 \phi(y)^2$ to the effective action of the theory. Since the
ambiguous term we found is $c\,\delta(x - y)$, the ambiguity in the effective action is indeed
$\phi(x)^4 \delta(x - y)$. We see that it corresponds to the counter term Fig. 6 and renormalizes
the coupling constant (the coefficient of the $\phi^4$-term in the action).



Figure 6: Counter-term diagram.



3.4.3 Example with sd(T) > n and preservation of symmetry 

In the two examples above, we had both times distributions $T$ with $\mathrm{sd}(T) = n$, which
led us to the introduction of one arbitrary constant $c$. This amount of ambiguity
increases with $\mathrm{sd}(T)$, but not all possible polynomials $P(\partial)\delta$ allowed by the counting
can arise physically. In particular, we require that our theory is still Lorentz invariant
after renormalization, and if it has a local gauge symmetry before, that needs to
be maintained as well (otherwise one has an anomaly that renders the theory ill-defined
at the quantum level since the number of degrees of freedom changes upon
renormalization).

Let us consider one example where SO(4) invariance (the Euclidean version of the
Lorentz group SO(3,1)) selects a subset of the possible counter terms.

In a theory with potential $\propto \phi^4$ (quartic interaction), there are not only
diagrams like Figure 4 but also ones like Figure 7, known as the setting sun
diagram.
Figure 7: Setting sun diagram in quartic interaction (lines labelled $G(x - y)$).



The term encoded by this diagram obviously contains $G^3 \sim \frac{1}{x^6}$, which has
$\mathrm{sd}(G^3) = 6 > n = 4$. By performing the same steps as above, in this case we a priori
get an ambiguity $c_1 \delta + c_2^i \partial_i \delta + c_3^{ij} \partial_i \partial_j \delta$ with in total $1 + 4 + \frac{4(4+1)}{2} = 15$ arbitrary
constants, but upon imposing SO(4)-invariance this reduces to $c_1 \delta + c_3 \Delta\delta$ with only
2 arbitrary constants.
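The counting can be reproduced with a few lines of code. This is only a sketch of the combinatorics: the helper name is made up, and the invariant count relies on the assumption that the only rotation-invariant polynomials in $\partial$ are polynomials in the Laplacian.

```python
from math import comb

def num_derivative_monomials(n, k):
    """Number of independent partial derivatives of order exactly k in n dimensions."""
    return comb(n + k - 1, k)

n = 4
omega = 6 - n  # singular order sd(G^3) - n

# Independent counter terms: derivatives of delta of order 0, 1, ..., omega.
counts = [num_derivative_monomials(n, k) for k in range(omega + 1)]
total = sum(counts)

# Rotation-invariant combinations: powers of the Laplacian of order <= omega,
# i.e. one term for each even order.
invariant = omega // 2 + 1

print(counts, total, invariant)  # [1, 4, 10] 15 2
```

Symmetric second derivatives account for the 10 in the list, matching $\frac{4(4+1)}{2}$ in the text.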

In the effective action, as above, they contribute to the quadratic terms (as
the diagram Fig. 7 has two external lines)
$\phi(x)(c_1 \delta(x - y) + c_3 \Delta\delta(x - y))\phi(y) = \phi(x)(c_3 \Delta + c_1)\phi(x)\, \delta(x - y)$.
We recognize that $c_3$ is a wave function renormalization
while $c_1$ renormalizes the mass term $m^2 \phi^2$.

The fact that $\phi^4$-theory is renormalizable in $n = 4$ means that these two counter
terms and the one in the previous subsection are the only ambiguities that arise
when any Feynman diagram of the theory is renormalized, a proof of this being well
beyond the scope of these notes.



3.5 Regaining compact support and RG flow 



In the above calculations, we ignored an important problem: $\phi_s(x) = \phi(x) - \phi(0)$
is not necessarily a test function, as it obviously has $\lim_{x \to \infty} \phi_s(x) = -\phi(0)$; therefore,
for example, the integral $\int dx\, \frac{\phi(x) - \phi(0)}{|x|}$ that we encountered in section 3.4.1 may
diverge at infinity. We can solve this by introducing a function $w(x) \in \mathcal{D}(\mathbb{R}^n)$ with
(without loss of generality) $w(0) = 1$. We then change the regularized part (i.e. the
part without arbitrary constants) of the integral in (54), written here for the
one-dimensional example of section 3.4.1, to

$$T_{1/|x|}[\phi] = \int dx\, \frac{\phi(x) - w(x)\phi(0)}{|x|}. \qquad (55)$$

This is a special case of the general formula

$$\phi_s(x) = \phi(x) - \sum_{|\alpha| \leq \omega} \frac{x^\alpha w(x)}{\alpha!} \left. \partial^\alpha \frac{\phi(x)}{w(x)} \right|_{x=0},$$

which replaces equation (53). Starting from (55) we can now write

$$\int dx\, \frac{\phi(x) - w(x)\phi(0)}{|x|} = \int dx\, \frac{w(x)(\phi(x) - \phi(0))}{|x|} + \int dx\, \frac{(1 - w(x))\phi(x)}{|x|}.$$
The second term already is a perfectly fine distribution; the first term can be
manipulated in the following way:

$$\int dx\, \frac{w(x)(\phi(x) - \phi(0))}{|x|} = \int dx\, \frac{w(x)}{|x|} \int_0^x du\, \phi'(u).$$

Now, in the inner integral we can substitute $u = tx$ and afterwards interchange the
integrals:

$$\int dx\, \frac{w(x)}{|x|} \int_0^1 dt\, x\, \phi'(tx) = \int_0^1 dt \int dx\, \mathrm{sign}(x)\, w(x)\, \phi'(tx).$$

After this, we can substitute $y = tx$ in the inner integral, giving us

$$\int_0^1 \frac{dt}{t} \int dy\, w\!\left(\frac{y}{t}\right) \mathrm{sign}(y)\, \phi'(y)
= \int dy\, f(y)\, \phi'(y) = T_f[\partial\phi] = -(\partial T_f)[\phi],
\qquad f(y) = \int_0^1 \frac{dt}{t}\, w\!\left(\frac{y}{t}\right) \mathrm{sign}(y).$$

So also the first term is a good distribution. The function $f(y)$ defined above as a
function of $y$ is well behaved, as $w(x)$, which enters its definition, is a test function
and therefore extremely well behaved; in particular it vanishes for large arguments, so
taking $t \to 0$ does not introduce problems.

As an example, let us now set $w(x) = \theta(1 - M|x|)$ (or actually a smoothed-out
version of this non-continuous function):

$$f(y) = \int_0^1 \frac{dt}{t}\, \theta\!\left(1 - \frac{M|y|}{t}\right) \mathrm{sign}(y)
= \int_{M|y|}^1 \frac{dt}{t}\, \theta(1 - M|y|)\, \mathrm{sign}(y)
= -\ln(M|y|)\, \theta(1 - M|y|)\, \mathrm{sign}(y).$$

Morally, we regularized our distribution with non-integrable kernel $\propto \frac{1}{|y|}$ by substituting
the derivative of a distribution with kernel $\propto \log(|y|)$, which is integrable.

In the above calculations, we introduced a mass/energy scale $M$. It is now an
important question to ask how the distribution changes under transformations of
this scale, i.e. renormalization group (RG) transformations generated by $M \frac{\partial}{\partial M}$,
so-called RG flows. We will now show that it is only the part $\mathrm{const} \cdot \delta(x)$, i.e. the part
that is fixed by arbitrary renormalization constants, that will change.

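First of all, using $\partial_x \mathrm{sign}(x) = 2\delta(x)$, we see:

$$f'(x) = -\frac{1}{x}\, \theta(1 - M|x|)\, \mathrm{sign}(x) - \log(M|x|)\, \theta(1 - M|x|)\, 2\delta(x)$$

(the term from differentiating $\theta(1 - M|x|)$ drops out because $\log(M|x|)$ vanishes at
$M|x| = 1$). Then, we start with:

$$M \frac{\partial}{\partial M} T_{1/|x|}[\phi] = M \frac{\partial}{\partial M} \left[ \int dx\, \frac{(1 - w(x))\phi(x)}{|x|} + T_{-f'}[\phi] \right].$$

Because of $\frac{1 - w(x)}{|x|} = \frac{\theta(M|x| - 1)}{|x|}$ in our example, the first term becomes a distribution
with kernel

$$M \frac{\partial}{\partial M} \frac{\theta(M|x| - 1)}{|x|} = M \delta(M|x| - 1). \qquad (56)$$

The second term in contrast becomes a distribution with kernel

$$M \frac{\partial}{\partial M} (-f'(x)) = -M \delta(M|x| - 1) + M \frac{\partial}{\partial M} \left[ 2 \log(M|x|)\, \theta(1 - M|x|)\, \delta(x) \right].$$

The first term of this expression obviously cancels with the contribution from (56),
so $M \frac{\partial}{\partial M} T_{1/|x|}$ turns out to be a distribution with kernel:

$$M \frac{\partial}{\partial M} \left[ 2 \log(M|x|)\, \theta(1 - M|x|)\, \delta(x) \right]
= 2\delta(x) \left[ \theta(1 - M|x|) - \log(M|x|)\, M \delta(1 - M|x|)\, |x| \right]
= 2\delta(x).$$

In the last step, we used the presence of the factor $\delta(x)$ (under an integral!) to set
$\log(M|x|)|x| = 0$ and $\theta(1 - M|x|) = 1$. So, under a renormalization group transformation,
the distribution changes by $\delta T \propto \mathrm{const} \cdot \delta(x)$, which means that a change of
energy scale corresponds to a change of the (at the beginning) arbitrarily selected
renormalization coefficients.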
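The flow can be checked numerically for the sharp cut-off $w(x) = \theta(1 - M|x|)$. The sketch below is an illustration under assumptions: it uses one particular even test function supported in $(-1, 1)$, takes $M \geq 1$, cuts the integration off at a tiny `x_min` instead of $0$, and compares a finite difference in $\log M$ with the predicted coefficient $2\phi(0)$. The helper names `log_quad` and `T` are hypothetical.

```python
import math

def bump(x):
    """Even smooth test function supported in (-1, 1)."""
    d = 1.0 - x * x
    return math.exp(-1.0 / d) if d > 0.0 else 0.0

def log_quad(g, a, b, n=20000):
    """int_a^b g(x)/x dx for 0 < a <= b, midpoint rule in s = log x."""
    sa, sb = math.log(a), math.log(b)
    step = (sb - sa) / n
    total = 0.0
    for k in range(n):
        total += g(math.exp(sa + (k + 0.5) * step))
    return step * total

def T(M, phi=bump, x_min=1e-8):
    """int dx (phi(x) - theta(1 - M|x|) phi(0)) / |x| for even phi supported in (-1, 1), M >= 1."""
    cut = 1.0 / M
    inner = log_quad(lambda x: phi(x) - phi(0.0), x_min, cut)  # region where w = 1
    outer = log_quad(phi, cut, 1.0)                            # region where w = 0
    return 2.0 * (inner + outer)  # factor 2: both signs of x

# Finite-difference version of M dT/dM between M = 2 and M = 8.
flow = (T(8.0) - T(2.0)) / math.log(8.0 / 2.0)
print(flow, 2.0 * bump(0.0))
```

Both printed numbers should agree up to discretization error, reflecting that a change of the scale $M$ only shifts the coefficient of $\delta[\phi] = \phi(0)$.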



3.6 What we have achieved in this section 

We have seen a way to recast what look like divergent Feynman diagrams as
what look like distributions for non-integrable functions. We could then turn these
into proper distributions by first restricting the space of test functions and then
extending them to full distributions, possibly at the price of a finite number of undetermined
numerical constants. Those have to be determined by a finite number of
measurements.

In order for the number of introduced parameters for all Feynman diagrams of the
theory to be finite, the scaling degrees of all appearing distributions in all diagrams
have to be below some maximum; otherwise the theory is not renormalizable.



4 Summary 

The material in these notes will not be useful for any concrete calculation in quantum 
field theory that a physicist might be interested in. But they might give him or her 
some confidence that the calculation envisaged has a chance to be meaningful. 

We tried to present material that is in no sense original but still is probably 
not covered in most introductions to quantum field theory. Hopefully, it helps 
to refute some of the prejudices against (perturbative) quantum field theory that 
mathematically minded people may have and helps others to better understand how 
far the hand waving arguments that we use in our daily work can carry. 

In particular, we put our emphasis on two points: Even if the perturbative 
expansion is divergent as a power series it can serve two purposes: The first terms do 
provide a numerically good approximation to the true, non-perturbative result and
all terms taken together can indeed recover the full result but only in terms of Borel 
resummation rather than as a power series. Second, unphysical infinite momentum 
integrals in the computation of Feynman diagrams can be avoided when properly 
expressed in terms of distributions. The renormalization of coupling constants is 
then expressed as the problem to extend a distribution from a subspace to all test 
functions. The language of distribution theory allows one to avoid mathematically 
ill-defined divergent expressions altogether. 